Intro to geospatial data & coordinate reference systems
Load spatial data in R
Explore and map the data
Basic transformations
Practice
October 16, 2017
Download the workshop files
Install any required libraries
Who are you?
Why are you here?
Data about locations on or near the surface of the Earth.
Convey geographic information but don't specify location on the surface of the Earth.
represent location geometrically with coordinates
46.130479, -117.134167

Coordinates indicate specific locations on the Earth when associated with a geographic coordinate reference system or CRS.
Specifies (1) a model of the Earth's shape, (2) a datum that anchors that model to the Earth, and (3) the coordinate units.
Because of variations in 1-3, there are many geographic CRSs!
The World Geodetic System of 1984 is the default geographic CRS used today.
A Projected CRS applies a map projection to a Geographic CRS
Map Projection: a mathematical transformation from the curved surface of the Earth to a flat surface

There are many, many projected CRSs
All introduce distortion, e.g. in shape, area, distance, or direction
No best one for all purposes
Selection depends on location, extent and purpose
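To make "mathematical transformation from curved to flat surface" concrete, here is a minimal base-R sketch (not part of the workshop code) of the forward Web Mercator (EPSG:3857) equations; the function name lonlat_to_webmercator is ours. In practice you would let spTransform() do this for you.

```r
# Illustrative sketch: forward Web Mercator projection (EPSG:3857).
# 6378137 m is the WGS84 semi-major axis; Web Mercator treats the
# Earth as a sphere of that radius.
lonlat_to_webmercator <- function(lon, lat) {
  R <- 6378137
  x <- R * lon * pi / 180
  y <- R * log(tan(pi / 4 + (lat * pi / 180) / 2))
  c(x = x, y = y)
}

# Project the example coordinates from earlier in the slides
lonlat_to_webmercator(-117.134167, 46.130479)
```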

Spatial data is a more generic term that is not just for geographic data.
Spatial data are powerful because
Points, lines, and polygons

Regular grid of cells (or pixels)

There are many approaches to and packages for working with geospatial data in R.
One approach is to keep it simple and store geospatial data in a data frame
cafes <- read.csv('data/cafes.csv')
head(cafes)
##   X        name      long      lat taste price crowded food
## 1 1     babette -122.2554 37.86843    10  4.00       1    1
## 2 2     musical -122.2607 37.86838     7  3.25       1    1
## 3 3   starbucks -122.2661 37.87044     6  2.95       1    0
## 4 4       yalis -122.2664 37.87353     7  2.95       0    0
## 5 5 berkeleyesp -122.2687 37.87366     3  3.25       1    0
## 6 6     fertile -122.2689 37.87493     5  3.25       0    1
plot(cafes$long,cafes$lat)
ggplot2 and ggmap
library(ggplot2)
library(ggmap)
ggplot() + geom_point(data=cafes, aes(long, lat), col="red", size=2)
There are advantages to storing geographic data as spatial objects.
Data cleaning before mapping
CRS transformations and geoprocessing
Compute spatial metrics and relationships
Data linkages - when data are in same CRS
sp package: Classes and Methods for Spatial Data
The sp package is most commonly used to construct and manipulate spatial data objects in R.
Hundreds of other R packages that do things with spatial data typically build on sp objects.
sf package: The sf, or simple features, package in R has many improvements
Based on open standards for specifying spatial data
But most spatial packages still depend on sp
So, live on the bleeding edge or check back in a year or so.
sp package
library(sp)
getClass("Spatial") # See all sp object types
## Class "Spatial" [package "sp"] ## ## Slots: ## ## Name: bbox proj4string ## Class: matrix CRS ## ## Known Subclasses: ## Class "SpatialPoints", directly ## Class "SpatialMultiPoints", directly ## Class "SpatialGrid", directly ## Class "SpatialLines", directly ## Class "SpatialPolygons", directly ## Class "SpatialPointsDataFrame", by class "SpatialPoints", distance 2 ## Class "SpatialPixels", by class "SpatialPoints", distance 2 ## Class "SpatialMultiPointsDataFrame", by class "SpatialMultiPoints", distance 2 ## Class "SpatialGridDataFrame", by class "SpatialGrid", distance 2 ## Class "SpatialLinesDataFrame", by class "SpatialLines", distance 2 ## Class "SpatialPixelsDataFrame", by class "SpatialPoints", distance 3 ## Class "SpatialPolygonsDataFrame", by class "SpatialPolygons", distance 2
| Geometry | Spatial Object | Spatial Object with Attributes |
|---|---|---|
| Points | SpatialPoints | SpatialPointsDataFrame |
| Lines | SpatialLines | SpatialLinesDataFrame |
| Polygons | SpatialPolygons | SpatialPolygonsDataFrame |
We use the S*DF objects most frequently!
rentals <- read.csv('data/sf_airbnb_2bds.csv')
class(rentals)
## [1] "data.frame"
dim(rentals)
## [1] 1206 14
str(rentals)
## 'data.frame': 1206 obs. of 14 variables:
## $ id                  : int 91957 18139835 172943 2309140 2559297 149108 10832295 15134083 283235 8013423 ...
## $ name                : Factor w/ 1203 levels " Victorian Suite in Inner Mission",..: 1091 267 917 1128 479 904 349 696 1064 976 ...
## $ latitude            : num 37.8 37.8 37.8 37.8 37.8 ...
## $ longitude           : num -122 -122 -122 -122 -122 ...
## $ property_type       : Factor w/ 4 levels "Apartment","Condominium",..: 1 1 3 1 1 3 2 1 1 1 ...
## $ room_type           : Factor w/ 1 level "Entire home/apt": 1 1 1 1 1 1 1 1 1 1 ...
## $ accommodates        : int 6 4 3 4 4 3 6 4 4 4 ...
## $ bathrooms           : num 1.5 1 1 1 1 1 1 1 1 2 ...
## $ bedrooms            : int 2 2 2 2 2 2 2 2 2 2 ...
## $ beds                : int 4 2 2 2 2 2 2 2 2 2 ...
## $ price               : int 300 275 279 395 235 300 450 349 199 235 ...
## $ review_scores_rating: int 94 100 100 97 86 100 100 92 96 93 ...
## $ neighbourhood       : Factor w/ 52 levels "Alamo Square",..: 7 52 7 52 52 7 52 26 52 20 ...
## $ listing_url         : Factor w/ 1206 levels "https://www.airbnb.com/rooms/1003756",..: 1156 490 446 666 689 293 55 311 711 1073 ...
head(rentals)
##         id                                      name latitude longitude
## 1    91957                    Sunny SF Flat with 5*s 37.76194 -122.4493
## 2 18139835 Brand New Apartment in the Heart of NOPA! 37.77568 -122.4453
## 3   172943        Redwood Cottage on Ashbury Heights 37.76022 -122.4495
## 4  2309140                               The Lady Di 37.77524 -122.4394
## 5  2559297       Entire 2BR w/pkg centrally-located! 37.77507 -122.4409
## 6   149108          QUIET LUXURY COLE VALLEY PARKING 37.76101 -122.4507
##   property_type       room_type accommodates bathrooms bedrooms beds price
## 1     Apartment Entire home/apt            6       1.5        2    4   300
## 2     Apartment Entire home/apt            4       1.0        2    2   275
## 3         House Entire home/apt            3       1.0        2    2   279
## 4     Apartment Entire home/apt            4       1.0        2    2   395
## 5     Apartment Entire home/apt            4       1.0        2    2   235
## 6         House Entire home/apt            3       1.0        2    2   300
##   review_scores_rating         neighbourhood
## 1                   94           Cole Valley
## 2                  100 Western Addition/NOPA
## 3                  100           Cole Valley
## 4                   97 Western Addition/NOPA
## 5                   86 Western Addition/NOPA
## 6                  100           Cole Valley
##                              listing_url
## 1    https://www.airbnb.com/rooms/91957
## 2 https://www.airbnb.com/rooms/18139835
## 3   https://www.airbnb.com/rooms/172943
## 4  https://www.airbnb.com/rooms/2309140
## 5  https://www.airbnb.com/rooms/2559297
## 6   https://www.airbnb.com/rooms/149108
hist(rentals$price)
cheap <- subset(rentals, price < 401)
hist(cheap$price)
cheap_good <- subset(cheap, review_scores_rating > 98)
hist(cheap_good$price)
Use the sp::coordinates() method
Requires a vector indicating the x,y columns
sp will create a SpatialPointsDataFrame from the data frame
#First make a copy
cheap_good_orig <- cheap_good
coordinates(cheap_good) <- c('longitude','latitude')
class(cheap_good)
## [1] "SpatialPointsDataFrame" ## attr(,"package") ## [1] "sp"
str(cheap_good_orig)
## 'data.frame': 374 obs. of 14 variables:
## $ id                  : int 18139835 172943 149108 10720392 14594885 15546103 13516315 10058568 1721354 3585034 ...
## $ name                : Factor w/ 1203 levels " Victorian Suite in Inner Mission",..: 267 917 904 504 448 323 948 314 519 389 ...
## $ latitude            : num 37.8 37.8 37.8 37.8 37.8 ...
## $ longitude           : num -122 -122 -122 -122 -122 ...
## $ property_type       : Factor w/ 4 levels "Apartment","Condominium",..: 1 3 3 1 2 3 1 1 3 1 ...
## $ room_type           : Factor w/ 1 level "Entire home/apt": 1 1 1 1 1 1 1 1 1 1 ...
## $ accommodates        : int 4 3 3 6 4 4 4 4 4 4 ...
## $ bathrooms           : num 1 1 1 1.5 2 1 2 2 1 1.5 ...
## $ bedrooms            : int 2 2 2 2 2 2 2 2 2 2 ...
## $ beds                : int 2 2 2 2 2 2 3 3 2 2 ...
## $ price               : int 275 279 300 250 200 225 350 215 389 200 ...
## $ review_scores_rating: int 100 100 100 100 99 100 100 100 100 100 ...
## $ neighbourhood       : Factor w/ 52 levels "Alamo Square",..: 52 7 7 20 52 20 7 7 20 7 ...
## $ listing_url         : Factor w/ 1206 levels "https://www.airbnb.com/rooms/1003756",..: 490 446 293 41 268 336 200 3 439 776 ...
str(cheap_good)

You can see from str(cheap_good) that a SPDF object is a collection of slots or components. The key ones are:
@data: the data frame of attributes that describe each location
@coords: the coordinates for each location
@bbox: the min and max bounding coordinates
@proj4string: the coordinate reference system definition as a string
Review the output of each of these:
summary(cheap_good)
head(cheap_good@coords)
head(cheap_good@data)
cheap_good@bbox
cheap_good@proj4string
Are all the columns that were present in the DF now in the SPDF?
Is there a slot without data?
cheap_good@proj4string # get a CRS object
## CRS arguments: NA
Defining a CRS for a spatial data object locates its coordinates on the surface of the Earth
You need to know
Known as defining a projection in ArcGIS
Defining a CRS != Transforming CRS
We will get to transformations soon!
Need the proj4 string that defines the CRS
# Create a CRS object
WGS84_CRS <- CRS("+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0")
# Set the CRS of the SPDF
proj4string(cheap_good) <- WGS84_CRS
# check it
cheap_good@proj4string
## CRS arguments:
## +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
# or use the EPSG code directly
proj4string(cheap_good) <- CRS("+init=epsg:4326")
# or enter the string
proj4string(cheap_good) <- CRS("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
What happens if we assign the wrong CRS?
# 26910 is the EPSG code for UTM Zone 10, NAD83
proj4string(cheap_good) <- CRS("+init=epsg:26910")
What happens if we assign the wrong CRS?
Software doesn't care but you need to!
See http://spatialreference.org/
Use this site to find EPSG codes and proj4 CRS strings
Geographic
4326 Geographic, WGS84
4269 Geographic, NAD83 (USA)
Projected
5070 USA Contiguous Albers Equal Area Conic
3310 CA Albers Equal Area
26910 UTM Zone 10, NAD83 (Northern Cal)
3857 Web Mercator (web maps)
Use http://spatialreference.org/ to make an educated guess as to the CRS of these coordinates:
X = 549228.058249, Y = 4176578.444299
Strategy:
What are the units for that CRS?
sp includes a plotting method spplot
You can use it to create great maps but it is very low level
which means, complex, non-intuitive syntax, long code
spplot the data:
spplot(cheap_good)
## Warning in stack.data.frame(df[zcol]): non-vector columns will be ignored
spplot the data:
spplot(cheap_good, "price")
Use spplot to create data maps from some of the other columns in the @data slot
Getting help:
?spplot
spplot(cheap_good, "bathrooms")
spplot(cheap_good, "accommodates")
spplot(cheap_good, "property_type")
spplot(cheap_good, "neighbourhood")
Vector points, lines & polygons:
Raster grids
This is one of the most common, if not the most common, spatial vector data file formats.
Old but everywhere!
Gotchas: 2GB file size limit, 10-character column name limit
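As a guard against the column-name gotcha, a small base-R helper (ours, not part of the workshop code) can flag data frame columns whose names exceed the shapefile DBF field-name limit of 10 characters before you export with writeOGR(), which would otherwise truncate them silently:

```r
# Illustrative sketch: warn about data frame column names that a
# shapefile export would truncate (DBF field names max out at 10 chars).
check_shp_names <- function(df, limit = 10) {
  too_long <- names(df)[nchar(names(df)) > limit]
  if (length(too_long) > 0)
    warning("These names will be truncated: ", paste(too_long, collapse = ", "))
  invisible(too_long)
}

# e.g. the rentals data: review_scores_rating is 20 characters long
check_shp_names(data.frame(review_scores_rating = 1, price = 2))
```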
There's an R package for that!
rgdal
rgdal provides R bindings to the powerful and widely used GDAL library.
It is the most commonly used R library for importing and exporting spatial data.
OGR: for vector data: readOGR() and writeOGR()
GDAL: for raster data: readGDAL() and writeGDAL()
rgdal
library(rgdal)
## rgdal: version: 1.2-8, (SVN revision 663)
## Geospatial Data Abstraction Library extensions to R successfully loaded
## Loaded GDAL runtime: GDAL 2.1.3, released 2017/01/20
## Path to GDAL shared files: /Library/Frameworks/R.framework/Versions/3.4/Resources/library/rgdal/gdal
## Loaded PROJ.4 runtime: Rel. 4.9.3, 15 August 2016, [PJ_VERSION: 493]
## Path to PROJ.4 shared files: /Library/Frameworks/R.framework/Versions/3.4/Resources/library/rgdal/proj
## Linking to sp version: 1.2-4
# See what file types are supported by rgdal drivers
# ogrDrivers()$name
gdal.org
?readOGR
For more info on working with rgdal to load different types of spatial data in R see this excellent tutorial by Zev Ross.
sfboundary <- readOGR(dsn="data",layer="sf_boundary")
## OGR data source with driver: ESRI Shapefile
## Source: "data", layer: "sf_boundary"
## with 1 features
## It has 3 fields
# or
# sfboundary <- readOGR("data","sf_boundary")
# but not
#sfboundary <- readOGR(dsn="data/",layer="sf_boundary")
# str(sfboundary)
summary(sfboundary)
## Object of class SpatialPolygonsDataFrame
## Coordinates:
##         min       max
## x  542696.6  556659.9
## y 4173563.7 4185088.6
## Is projected: TRUE
## proj4string :
## [+proj=utm +zone=10 +datum=NAD83 +units=m +no_defs +ellps=GRS80
## +towgs84=0,0,0]
## Data attributes:
##            NAME       Pop2010         Land_sqmi
## San Francisco:1   Min.   :805235   Min.   :46.87
##                   1st Qu.:805235   1st Qu.:46.87
##                   Median :805235   Median :46.87
##                   Mean   :805235   Mean   :46.87
##                   3rd Qu.:805235   3rd Qu.:46.87
##                   Max.   :805235   Max.   :46.87
How?
plot(sfboundary)
How?
head(sfboundary@data)
##            NAME Pop2010 Land_sqmi
## 0 San Francisco  805235     46.87
How?
Is this a geographic or projected CRS?
sfboundary@proj4string
## CRS arguments:
## +proj=utm +zone=10 +datum=NAD83 +units=m +no_defs +ellps=GRS80
## +towgs84=0,0,0
plot(sfboundary)
points(cheap_good, col="red")
## Plot with Rentals
Where are the points?
plot(sfboundary)
points(cheap_good, col="red")
Compare the CRSs, are they the same?
proj4string(sfboundary)
proj4string(cheap_good)
proj4string(sfboundary) == proj4string(cheap_good)
sfboundary@bbox
##         min       max
## x  542696.6  556659.9
## y 4173563.7 4185088.6
cheap_good@bbox
##                  min        max
## longitude -122.50647 -122.38947
## latitude    37.70738   37.80646
All geospatial data should have the same CRS.
Geospatial data transformations are super common.
Most common transformation is the CRS.
This is also called a projection transformation, or reprojection.
Use sp function spTransform
Requires as input
Outputs a new spatial object with coordinate data in the target CRS
Transform sfboundary:
sf_lonlat <- spTransform(sfboundary, WGS84_CRS)
Use the CRS of another data layer
sf_lonlat <- spTransform(sfboundary, CRS(proj4string(cheap_good)))
Use CRS string
sf_lonlat <- spTransform(sfboundary, CRS("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs"))
Use an EPSG code
sf_lonlat <- spTransform(sfboundary, CRS("+init=epsg:4326"))
Use a CRS object
WGS84_CRS <- CRS(proj4string(cheap_good))
sf_lonlat <- spTransform(sfboundary, WGS84_CRS)
How will we know?
proj4string(cheap_good) == proj4string(sf_lonlat)
plot(sf_lonlat)
points(cheap_good, col="red")
points(cheap_good[cheap_good$price<100,], col="green", pch=19)
We want all data in the same CRS
Which one is best?
Let's read in some line data
In the popular GeoJSON file format

GeoJSON can be tricky!
sf_streets <- readOGR(dsn='data/sf_highways.geojson', layer="OGRGeoJSON")
## OGR data source with driver: GeoJSON
## Source: "data/sf_highways.geojson", layer: "OGRGeoJSON"
## with 246 features
## It has 5 fields
plot(sf_streets)
str(sf_streets)
summary(sf_streets)
How do we do that?
plot(sf_lonlat)
lines(sf_streets)
points(cheap_good, col="red")
points(cheap_good[cheap_good$price<200,], col="green")
Export data with rgdal::writeOGR()
Created Spatial{Points, Lines, Polygons}DataFrames
Defined CRS with proj4string()
Transformed CRS with spTransform()
Mapped data with plot() and spplot()
Mapped multiple geospatial data layers
Also called thematic maps or data maps
Use data values to determine symbology
Use symbology to convey data values / meaning
Lots of them
Let's quickly discuss the most common ones
plot
Pros
Cons
spplot
Pros
Cons
ggplot2 and ggmap
Pros
ggplot2, ggmap
Cons
sp objects
tmap
Pros
sp objects, ggplot2, interactive tmaps
Cons
leafletPros
R port of the popular Javascript library
Allows one to create interactive maps that can be served on web
Highly customizable!
Cons
tmap stands for thematic map
Nice maps with less code than the alternatives
Syntax should be familiar to ggplot2 users, but simpler
tmap in a Nutshell is a good starting point
Load the library
library(tmap)
?tmap
Let's explore population data for US states.
The file is data/us_states_pop.shp
Challenge
Use readOGR to load it and plot to view it
uspop <- readOGR("./data", "us_states_pop")
## OGR data source with driver: ESRI Shapefile
## Source: "./data", layer: "us_states_pop"
## with 49 features
## It has 4 fields
# Review the data
# summary(uspop)
What type of data object is uspop?
How many features does it contain?
How many attributes describe those features?
What is the CRS?
qtm: Make a quick plot with default options
qtm(uspop)
qtm of the data values:
qtm(uspop, "POPULATION")
tmap Shapes and Thematic Elements: tmap's flexibility comes from how intuitively it lets you layer spatial data and style the layers by data attributes
Use tm_shape(<sp_object>) to specify a geospatial data layer
Add + tm_<element>(...) to style the layer by data values
…and other options for creating a publication ready map
Explore tmap functionality:
?tm_polygons
vignette('tmap-nutshell')

tm_shape(uspop) + tm_polygons(col="beige", border.col = "red")
Customize the map with different fill and outline colors
You can use color names or HEX values
tm_shape(uspop) + tm_polygons(col="#f6e6a2", border.col = "white")
Notice anything odd about shape of USA?
tm_shape(uspop) + tm_polygons(col="#f6e6a2", border.col = "white")
Setting the CRS Dynamically
Is this better?
tm_shape(uspop, projection="+init=epsg:5070") + tm_polygons(col="#f6e6a2", border.col = "white")
Dynamic CRS transformations can make operations slow, and some operations don't support them.
Transform the uspop data to the USA Contiguous Albers CRS (5070),
saving it as a new SpatialPolygonsDataFrame called uspop_5070
uspop_5070 <- spTransform(uspop, CRS("+init=epsg:5070"))
What's happening here?
tm_shape(uspop_5070) + tm_polygons(col="#f6e6a2", border.alpha = 0) + tm_shape(uspop) + tm_borders(col="purple")
Color areas by data values
tm_shape(uspop_5070) + tm_polygons(col="POPULATION")
Color Palettes
Data Classification
The R package RColorBrewer is often used to select color palettes.
Read the package help for more info.
?RColorBrewer or see colorbrewer.org
library(RColorBrewer)
# ?RColorBrewer
You can use the display.brewer.pal() function to see the colors in a specific named palette, or display.brewer.all() for all of them.
For example:
display.brewer.all()
Qualitative - complementary colors, e.g. pastels, to emphasize different categories
Sequential - a range of different shades of the same color (hue) to imply higher to lower ranks or values
Divergent - two sequential color palettes with a light or grey color in the middle; used to emphasize outliers
display.brewer.all(type="qual")
display.brewer.all(type="seq")
display.brewer.all(type="div")
Let's see how a BuPu (blue-purple) palette works with this data.
tm_shape(uspop_5070) + tm_polygons(col="POPULATION", palette="BuPu")
Try a sequential, divergent and qualitative color palette with this dataset.
Set auto.palette.mapping=FALSE and TRUE with the Spectral palette.
?tm_polygons
tm_shape(uspop_5070) + tm_polygons(col="POPULATION", palette="Spectral", auto.palette.mapping=FALSE)
Important for improving the cartographic display of continuous data
Unclassified maps scale full range of values to color palette
Great for exploring trends and outliers,
but hard to interpret specific data values.
The eye can only differentiate a few colors.
You can use a data classification method to reduce the complexity in your mapped data by classifying continuous values into a small number of bins, typically 5-7.
A unique symbology - here, color - is then associated with each bin.
A legend indicates the association, making it easier to interpret data values.
Common methods for binning data into a set of classes include:
equal interval: classes the data into bins of equal data ranges, e.g. 10-20, 20-30, 30-40.
quantile: classes the data into bins with an equal number of data observations. This is the default for most mapping software.
fisher/jenks/natural breaks: classes data into bins that minimize within-group variance and maximize between-group variance.
standard deviation: classes emphasize outliers by classing data into bins based on standard deviations around the mean, e.g. -2 to +2 SDs.
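A quick base-R sketch (not part of the workshop code; the prices vector is made up for illustration) of how equal-interval and quantile breaks bin the same data differently:

```r
# Illustrative sketch: equal-interval vs quantile classification.
prices <- c(100, 120, 150, 200, 210, 220, 250, 300, 390, 400)

# Equal interval: 5 bins of equal width across the data range
equal_breaks <- seq(min(prices), max(prices), length.out = 6)

# Quantile: 5 bins, each holding roughly the same number of observations
quantile_breaks <- quantile(prices, probs = seq(0, 1, 0.2))

# Count observations per bin under each scheme
table(cut(prices, equal_breaks, include.lowest = TRUE))
table(cut(prices, quantile_breaks, include.lowest = TRUE))
```

tmap does this for you via its style parameter; this just shows why the two schemes communicate different messages.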
tmap uses the classification methods in the classInt package
The classification method is set by the style parameter in tm_polygons - or another style element, e.g. tm_symbols
See ?tm_polygons for keyword options
tmap data classification:
tm_shape(uspop_5070) + tm_polygons(col="POPULATION", palette="BuPu", style="jenks")
Try several of the different classification schemes.
Note how they impact the display of the data and thus communicate different messages
What is the default classification scheme used by tmap?
In addition to setting the classification method you can set the number of classes, or cuts.
By default this is 5
Explore setting it to 3 or 9 with n=
tm_shape(uspop_5070) + tm_polygons(col="POPULATION", palette="BuPu", style="jenks", n=9)
The Art and Science of Map Making
Science:
Art:
Map popdens instead of POPULATION
Don't use defaults
tm_shape(uspop_5070) + tm_polygons(col=c("POPULATION","popdens"),
style=c("jenks","jenks"))
Sometimes polygons get in the way of what you want to communicate
Why?
What state has the greatest population density?
You can use tm_symbols instead of tm_polygons to map the data as points
tm_shape(uspop_5070) + tm_symbols(col="popdens", style="jenks")
tm_shape(uspop_5070) + tm_polygons(col="white", border.alpha = 0.5) + tm_shape(uspop_5070) + tm_symbols(col="popdens", style="jenks")
Graduated Color Map when data are classified
Proportional Color Map when they are not
Change the marker shape to square
Vary symbol size by data value
tm_shape(uspop_5070) + tm_polygons(col="white", border.alpha = 0.5) + tm_shape(uspop_5070) + tm_symbols(size="popdens", style="jenks")
Take a look at ?tm_symbols and adjust the graduated symbol map
Also update the legend title
tm_shape(uspop_5070) + tm_polygons(col="white", border.alpha = 0.5) + tm_shape(uspop_5070) + tm_symbols(size="popdens", style="sd", size.lim=c(1,500))
tmap has two modes: plot and view
The plot mode is the default and is for creating static maps.
The view mode is for creating interactive maps using leaflet.
You can switch back and forth with the tmap_mode() function
tmap_mode("view")
## tmap mode set to interactive viewing
tm_shape(uspop) + tm_polygons(col="POPULATION", style="jenks")
?tmap_mode
tmap_mode("view")
## tmap mode set to interactive viewing
tm_shape(uspop) + tm_polygons(col="POPULATION", style="jenks", popup.vars=c("NAME","POPULATION","popdens"))
Return to plot mode
tmap_mode('plot')
## tmap mode set to plotting
See vignette("tmap-modes") for more on interactive maps.
save_tmap: You can save static and interactive maps with save_tmap()
See ?save_tmap for details or vignette("tmap-nutshell")
Use the R leaflet package for more customized leaflet maps,
See the RStudio Leaflet tutorial.
tm_shape(uspop ) + tm_polygons() + tm_scale_bar(position=c("left","bottom")) + tm_compass() + tm_style_classic()
Challenge
See ?tm_symbols to see what parameter you would add to make the symbol sizes smaller
See ?tm_lines to see how you would increase the thickness (or weight) of the street lines
See ?tm_polygons to see how you would change the fill and border color and add transparency to the fill
Bonus - what parameter adjusts transparency?
tm_shape(sf_lonlat) + tm_polygons(col="beige", border.col = "blue") +
  tm_shape(sf_streets) + tm_lines(col="black", lwd = 3) +
  tm_shape(sf_streets) + tm_lines(col="white", lwd = 1) +
  tm_shape(cheap_good) + tm_symbols(col="red", size = 0.5, alpha=0.5)
If yes, what does it tell you?
tm_shape(sfboundary) + tm_polygons(col="beige") + tm_shape(cheap_good) + tm_symbols(col="red")
Mapping categorical (or qualitative) data with
tm_shape(sfboundary) + tm_polygons(col="beige") + tm_shape(cheap_good) + tm_symbols(shape="property_type")
Using color and size to convey data values
tm_shape(sfboundary) + tm_polygons(col="beige") + tm_shape(cheap_good) + tm_symbols(size="price")
Redo that map - set background color to transparent - set legend title to Price per Night
tm_shape(sfboundary) + tm_polygons(col="beige") + tm_shape(cheap_good) + tm_symbols(size="price", title.size = "Price per Night", alpha = 0)
tm_shape(sfboundary) + tm_polygons(col="beige") + tm_shape(cheap_good) + tm_symbols(col="price")
What worked?
What didn't?
How to improve?
tm_shape(sfboundary) + tm_polygons(col="beige") + tm_shape(cheap_good) + tm_symbols(col="price", palette = "YlOrRd")
Change the color palette to Blues and Purples (sequential)
Change the color palette to Spectral (divergent), using red for higher prices.
Use display.brewer.all and the Examples in ?tm_symbols for help
Hint: you may need to set auto.palette.mapping for Spectral colors
tm_shape(sfboundary) + tm_polygons(col="beige") +
  tm_shape(cheap_good) + tm_symbols(col="price", size=0.5, palette = "BuPu", auto.palette.mapping=FALSE)

tm_shape(sfboundary) + tm_polygons(col="beige") +
  tm_shape(cheap_good) + tm_symbols(col="price", size=0.5, palette = "-Spectral", auto.palette.mapping=FALSE)
Layers draw in the order that they are added to the map.
So, should it be
Data will display on the map in the order it appears in the data frame.
You can sort the data by a column value to control this.
Question: Would you order ascending or descending to better see cheap rentals?
# sort data frame by price
cheap_good <- cheap_good[order(cheap_good$price, decreasing = TRUE),]
tm_shape(sfboundary) + tm_polygons(col="beige") +
  tm_shape(cheap_good) + tm_symbols(col="price", size=0.5, palette = "-Spectral", auto.palette.mapping=FALSE)
tm_shape(sfboundary) + tm_polygons(col="white") + tm_shape(cheap_good) + tm_symbols(size="price", style="quantile", border.col="red", alpha=0, title.col="Price per Night")
tm_shape(sfboundary) + tm_polygons(col="white") + tm_shape(cheap_good) + tm_symbols(size="price", border.col="red", alpha=0, title.col="Price per Night", perceptual=T)
tm_shape(sfboundary) + tm_polygons(col="white") + tm_shape(cheap_good) + tm_symbols(size="price", style="sd", border.col="red", alpha=0, title.col="Price per Night")
Try some other classification schemes for your graduated symbol map
Note how they affect the display of the data
Hold size constant and vary color to create a Graduated color map
Choice of color palette is very important
tm_shape(sfboundary) +
tm_polygons(col="#fbf9f1") +
tm_shape(cheap_good) + tm_symbols(col="price", style="jenks", size=.5,
palette="Reds", auto.palette.mapping=F,
border.alpha=0, alpha=0.75, title.col="Price per Night")
Try different classification methods
Note how they change the
tm_shape(sfboundary) +
tm_polygons(col="white") +
tm_shape(cheap_good) + tm_symbols(col="price", style="equal",size=.5,
palette="Reds", auto.palette.mapping=F,
border.alpha=0, alpha=0.75, title.col="Airbnb 2bd Price")
Color areas by data values
The polygon version of the graduated color map
uspop <- readOGR("./data", "us_states_pop")
## OGR data source with driver: ESRI Shapefile
## Source: "./data", layer: "us_states_pop"
## with 49 features
## It has 4 fields
plot(uspop)
Add - scalebar - north arrow
tm_shape(uspop) + tm_polygons() + tm_scale_bar(position=c("left","bottom")) + tm_compass() + tm_style_classic()
Natural
Albatross
Beaver
etc…
tmap:
tm_shape(uspop) + tm_polygons(col="POPULATION")
tm_shape(uspop) + tm_polygons(col="POPULATION", style="jenks")
What's the CRS?
Why might that matter?
When larger areas are mapped using geographic coordinates the distortion becomes more noticeable.
So these data are typically transformed to another CRS that better preserves shape.
The Albers equal area conic is the typical projection for historical USGS maps of the lower 48
It is a general-purpose, low-distortion compromise for mid-latitude extents that are short and wide
Preserves relative shape
uspop_albers <- spTransform(uspop, CRS("+init=epsg:5070"))
tm_shape(uspop_albers) + tm_polygons(col="POPULATION", style="jenks")
Map population density instead of population
tm_shape(uspop_albers) + tm_polygons(col="popdens", style="jenks")
# vs
tm_shape(uspop_albers) + tm_polygons(col="POPULATION", style="jenks")
Map the states as points not polygons
How?
tm_shape(uspop_albers) + tm_polygons(col="white",border.alpha = 0.5)+ tm_shape(uspop_albers) + tm_symbols(col="popdens", style="jenks")
bigmap <- tm_shape(sfboundary) +
tm_polygons(col="beige") +
tm_shape(cheap_good) +
tm_symbols(size="coit_dist", title.size="Distance to Coit Tower (KM)", col="price", title.col="Price", shape="property_type", title.shape="Property Type") +
tm_layout( legend.bg.color="white",inner.margins=c(.05,.05, .15, .25), title="Airbnb 2 Bedroom Rentals, San Francisco Fall 2017", legend.position=c("right","center"))
bigmap
leaflet(cheap_good) %>% addTiles() %>%
addCircleMarkers(data = cheap_good, radius = 5, stroke=F,
color = "purple", fillOpacity = 0.75
)
library(RColorBrewer)
display.brewer.all()
pal <- colorQuantile("Reds",NULL,5)
leaflet(cheap_good) %>% addTiles() %>%
addCircleMarkers(
data = cheap_good,
radius = 6,
color = ~pal(price),
stroke = F,
fillOpacity = 0.75
)
popup_content <- cheap_good$name
popup_content <- paste0(popup_content, "<br>Price per night: $", cheap_good$price)
popup_content <- paste0(popup_content, "<br><a href=", cheap_good$listing_url, ">More info...</a>")
leaflet(cheap_good) %>% addTiles() %>%
addCircleMarkers(
data = cheap_good,
radius = 6,
color = ~pal(price),
stroke = F,
fillOpacity = 0.75,
popup = popup_content)
Think about it
Try it and see
Check the help: ?spDistsN1
coit_tower <- c(-122.405837, 37.802032)
cheap_good$coit_dist <- spDistsN1(cheap_good, coit_tower, longlat = TRUE)
head(cheap_good@data)
library(knitr)
purl("r-geospatial-workshop-f2017.Rmd", output = "test2.R", documentation = 2)
See the "Visualisation" section of the CRAN Task Views and the Quick-R website's page on graphical parameters.